Christopher Summerfield

Reward Models Inherit Value Biases from Pretraining

Jan 28, 2026
Via arXiv

Can AI mediation improve democratic deliberation?

Jan 09, 2026
Via arXiv

Reward Model Interpretability via Optimal and Pessimal Tokens

Jun 08, 2025
Via arXiv

HiBayES: A Hierarchical Bayesian Modeling Framework for AI Evaluation Statistics

May 08, 2025
Via arXiv

Increasing happiness through conversations with artificial intelligence

Apr 02, 2025
Via arXiv

Language Agents as Digital Representatives in Collective Decision-Making

Feb 13, 2025
Via arXiv

Flexible task abstractions emerge in linear networks with fast and bounded units

Nov 06, 2024
Via arXiv

Early learning of the optimal constant solution in neural networks and humans

Jun 25, 2024
Via arXiv

Using deep reinforcement learning to promote sustainable human behaviour on a common pool resource problem

Apr 23, 2024
Via arXiv

Regularised neural networks mimic human insight

Feb 22, 2023
Via arXiv